An Automatic Approach for Translating Simple Images into Text Descriptions and Speech for Visually Impaired People

Authors

  • Mrunmayee Patil
  • Ramesh Kagalkar
  • Girish Kulkarni
  • Visruth Premraj
  • Vicente Ordonez
  • Sagnik Dhar
  • Siming Li
  • Yejin Choi
  • Alexander C. Berg
  • Tamara L. Berg
  • Benjamin Z. Yao
  • Xiong Yang
  • Liang Lin
  • Mun Wai Lee
  • Song-Chun Zhu
  • Fan-Chieh Cheng
  • Shih-Chia Huang
  • James Z. Wang
  • Munawar Hayat
  • Mohammed Bennamoun
  • Mina Makar
  • Sam S. Tsai
  • David Chen
Abstract

Image processing is a rapidly growing field of research. Images come in different file formats and depict different subjects: objects, places, people, scientific and astronomical scenes, and many more. An image is a collection of pixels arranged in rows and columns. Images are captured, processed and stored for various uses. Sighted people can identify and analyze general images easily, but for blind and physically disabled people this is difficult. Unfortunately, no existing medium or interface lets such people communicate with the world. Blind or visually impaired people are often neglected by society, so there is a continuing need to help them. Hence, we propose a new technique for converting images into text as well as speech using image processing techniques such as pre-processing, image segmentation, edge detection, object detection and speech synthesis. In this paper we first introduce the need for image-to-text conversion for blind people and give a system overview of the image-to-text and speech conversion system. Edge detection plays an important role in this system, where the Canny edge detection algorithm is used to detect objects in images. Object recognition is done on the basis of the color, size, texture and shape of the object.
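The edge detection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and it is not the full Canny algorithm: it shows only the gradient-magnitude stage using Sobel kernels followed by a single threshold, whereas Canny also applies Gaussian smoothing, non-maximum suppression and hysteresis thresholding. The function name `sobel_edges`, the threshold value and the toy image are assumptions made for illustration.

```python
def sobel_edges(image, threshold):
    """Return a binary edge map for a grayscale image (list of rows of ints).

    Simplified sketch: Sobel gradient magnitude plus one threshold.
    Border pixels are left as 0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += kx[dr][dc] * pixel
                    gy += ky[dr][dc] * pixel
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[r][c] = 1 if magnitude >= threshold else 0
    return edges


# Hypothetical 5x6 test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_map = sobel_edges(image, threshold=200)
for row in edge_map:
    print(row)
```

On this toy image the interior rows are marked as edges only in the two columns that straddle the dark-to-bright boundary. A real system would more likely call a library implementation of Canny on a smoothed image rather than this hand-written loop.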


Similar resources

Haptic Browser: A Haptic Environment to Access HTML Pages

The application presented in this paper aims at producing a novel user-friendly haptic environment to allow blind or visually impaired people to access interactive presentations based on HTML web pages. The application is based on haptic and audio feedback. Additionally, an automatic HTML-to-haptics conversion tool is developed in order to provide a simple way to create interactive haptic presenta...


From Image to XML: Monitoring a Page Layout Analysis Approach for the Visually Impaired

Page layout analysis and the creation of an XML document from a document image are useful for many applications including the preservation of archived documents, robust electronic access to printed documents, and access to print materials by the visually impaired. In this paper, the authors describe a document image process pipeline comprised of techniques for the identification of article head...


Audiodescription research: state of the art and beyond

Audiodescription (AD) is a growing arts and media access service for visually impaired people. As a practice rooted in intermodal mediation, i.e. 'translating' visual images into verbal descriptions, it is in urgent need of interdisciplinary research-led grounding. Seeking to stimulate further research in this field, this paper aims to discuss the major dimensions of AD, give an overview of com...


Improvement of generative adversarial networks for automatic text-to-image generation

This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous research has used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, given as sentences, to produce and improve the image. The proposed ...


Kannada Text Extraction from Images and Videos for Vision Impaired Persons

We propose a system that reads the Kannada text encountered in natural scenes, with the aim of providing assistance to visually impaired persons in Karnataka state. This paper describes the system design and a standard-deviation-based Kannada text extraction method. The proposed system contains three main stages: text extraction, text recognition and speech synthesis. This paper concentrated on te...



Publication date: 2015